1. Introduction

This final report presents all work that the Silvus-UCLA team has conducted on AFOSR STTR
Contract FA9550-05-C-0103, TITLE: Throughput Optimization via Adaptive MIMO Communications.
It supersedes all previous quarterly progress reports and includes the work completed in the
last performance period, from April 1, 2006 to May 31, 2006. The work performed is divided
into the following five categories, each of which is reported in a separate section.

•  End-to-end Matlab packet simulation platform.

•  Low  density  parity  check  code  (LDPCC). 

•  Field  trials  with  Silvus  DSP  MIMO  testbed. 

•  High  mobility  extension. 

•  Preparation  for  FPGA  real-time  implementation. 

To be self-contained, the report starts with a brief problem statement and our approach.

Identification  and  Significance  of  the  Problem 

Since the early work of Foschini and Gans [6] [7] and Telatar [11] on the capacity of multiple
antenna radio (MAR), a great body of work has shown the potential for MIMO based communications
to deliver unprecedented spectral efficiency in multipath-rich environments. To a large extent
these studies have been theoretical and simulation based [12] [13]. A few experimental MIMO
systems have also been reported in the literature, including some from members of the Silvus
team [14][8][9]. However, these trials have mostly been limited to controlled indoor
environments, with only a few conducted in outdoor mobile environments.

The application of MIMO communications to the needs of UAV based communications is not fully
understood. Indeed, a UAV borne MIMO link poses several unique challenges from both an
algorithmic/protocol point of view and a hardware architecture point of view. For a UAV based
system operating in dense urban environments, whether below or above the building clutter, the
following unique challenges exist:


•  A  highly  dynamic  channel  due  to  the  high  mobility  of  the  UAV, 

•  A  channel  whose  capacity  and  degrees  of  freedom  will  vary  significantly  with  UAV  altitude, 

•  A  diversity  of  mission  requirements  including 

o  Use  in  reconnaissance  missions  behind  enemy  lines  (10s  of  Kbps) 
o  Use  as  a  stove-pipe  relay  node  (a  few  Mbps) 
o  Use  as  an  element  of  a  backbone  node  (100s  of  Mbps), 

•  A  diversity  of  the  UAV  platforms  in  use  by  the  Air  Force, 

•  A  potential  deployment  in  a  mobile  ad-hoc  network  (MANET). 

The high mobility and changing characteristics of the UAV channel with altitude imply that the
radio system must be adaptive in time and adaptive in space to exploit these characteristics.
The diversity of mission requirements implies that the radio must adapt the bandwidth, the
spectral efficiency, and the carrier frequency to be successful in all scenarios. In addition,
the ability to be used in a variety of UAV platforms and to integrate easily into a mobile
ad-hoc network (MANET) environment mandates a radio system that is both hardware and protocol
adaptive.

The  work  will  culminate  in  a  complete  system  level  design  as  well  as  a  scalable  hardware  architecture 
that  will  allow  the  proposed  MIMO  system  to  scale  with  the  capabilities  of  the  UAV  and  the  mission 
requirements. 


2.  Approach 

This  section  outlines  our  general  approach  to  addressing  the  needs  of  the  UAV  based  MIMO 
communications  system. 

High throughput communications will be addressed by recognizing that MIMO techniques, although
quite powerful, are only one element of an overall physical layer solution. MIMO must be
designed to operate in harmony with, and complementary to, the other physical layer parameters.
As such, the highest possible throughput under given environmental conditions is achieved by
the proper choice of:

•  MIMO  processing  (spatial  multiplexing,  or  space-time  coding,  or  diversity  processing,  or 
smart  antenna  processing) 

•  FEC  code  and  rate 

•  Constellation  size  and  type  of  modulation 

•  Signal  bandwidth 

•  Carrier  frequency 

Our system implements all four variants of multi-antenna techniques (spatial multiplexing,
space-time coding, diversity processing, and smart antenna processing). To get as close to the
Shannon capacity as possible, we incorporate advanced LDPC (low density parity check) codes.
Because the power of LDPC codes comes at the price of decoder complexity, we also incorporate
bit interleaved convolutional codes into our system requirements so that we can operate with
low decoder complexity over good channels. To combat multipath and to ensure high spectral
utilization we have adopted an OFDM based signaling scheme. Over the past several years OFDM
has become the modulation of choice for both wireline and wireless communication systems, and
it has been adopted for DVB standards, wireless LAN, and other standards.

Diversity of mission requirements dictates a high degree of adaptability in the physical layer
and in the radio architecture itself. Moreover, the widely varying conditions of the wireless
channel call for a great degree of configurability and robustness from the radio unit. Our
approach is to incorporate adaptability in the modulation format, coding rate, bandwidth,
carrier frequency, and MIMO processing. Underlying the adaptability of the signal bandwidth and
center frequency is a highly agile radio architecture that will be explored as part of the
proposed work.

High Doppler Communications refers to the case where the rate of change of the channel is high.
In general the Doppler frequency, $f_d$, defined as $f_d = v/\lambda$ (where $\lambda$ is the
wavelength of the carrier and $v$ is the relative speed between the transmitter and receiver),
represents the speed at which the channel changes. We overcome the limitations of extreme
mobility through a pilot symbol assisted modulation technique.
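
To give a sense of scale, the Doppler relation above can be evaluated for the vehicle speeds considered later in this report. The following Python sketch assumes the 2.4 GHz carrier used in the simulations of Section 3; the function name and the unit conversion constant are illustrative and are not part of the simulation platform.

```python
SPEED_OF_LIGHT_MPS = 299_792_458.0   # metres per second
MPH_TO_MPS = 0.44704                 # one mile per hour in metres per second


def doppler_hz(speed_mph, carrier_hz):
    """Maximum Doppler frequency f_d = v / lambda for a given relative speed."""
    wavelength_m = SPEED_OF_LIGHT_MPS / carrier_hz
    return speed_mph * MPH_TO_MPS / wavelength_m


if __name__ == "__main__":
    # Speeds taken from the simulation scenarios; the carrier is an assumption.
    for mph in (5, 100, 500):
        print(f"{mph:>3} MPH -> {doppler_hz(mph, 2.4e9):8.1f} Hz")
```

At 2.4 GHz this gives roughly 18 Hz at 5 MPH, 358 Hz at 100 MPH, and 1.8 kHz at 500 MPH.
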

Radio form factor sufficiently small to fit into UAV payload. Although the final size of the
radio is not directly addressed as part of the Phase I effort, it is nonetheless an important
issue that must be kept in mind in developing the system. As part of the Phase I effort we
searched for available processing and RF platforms with an eye towards fitting them into the
belly of a small UAV. The processing capability of such platforms might dictate, in Phase II,
the specific MIMO configurations that can be supported.

Over the course of the one year period of this project, the Silvus and UCLA team has
successfully completed the following tasks:

•  Matlab  MIMO-OFDM  Simulation  System. 

•  Design  and  simulation  of  Low  Density  Parity  Check  Code. 

•  Verification  with  Silvus  DSP  MIMO  Testbed. 

•  Design  and  Simulation  for  High  Mobility  Extension. 

•  Preparation  for  Real-Time  Implementation  on  FPGA. 

Each  item  will  be  reported  in  a  separate  section  hereafter. 


3.  Matlab  MIMO-OFDM  Simulation  System 

A packet structure has been developed to support realistic end-to-end physical layer packet
simulation. The packet structure is highly portable and supports interoperability with both
SISO and MIMO enabled nodes. Moreover, given the size and power constraints of a UAV based
communication system, there might be times when the desired QoS can be met in an energy
efficient SISO mode; the packet structure does not preclude this. Additionally, the packet
structure is designed to minimize overhead, which raises goodput relative to the over-the-air
data rate. Figure 1 shows the top-level packet structure. It starts with a universal SISO
header that is understandable by all systems. The Mode Identifier indicates the particular
MIMO configuration used for the remainder of the packet, and the receiver dynamically switches
to the proper decoder for the given mode.


ANT-1:  SISO AGC | SISO Chan Training | Mode Identifier | MIMO AGC | MIMO Chan Estimation | MIMO DATA
  ...
ANT-n:  SISO AGC | SISO Chan Training | Mode Identifier | MIMO AGC | MIMO Chan Estimation | MIMO DATA

Figure  1.  Physical  Layer  Frame  Structure 

The packet consists of six main fields. The first three require only one transmit antenna and
one receive antenna.

1. SISO AGC (Automatic Gain Control) Preamble. The SISO AGC field assists the receiver with
packet detection, gain control, and optionally coarse frequency and symbol timing
synchronization.

2. SISO Channel Training. The SISO Channel Training field assists the receiver with symbol
timing, carrier frequency synchronization, and channel estimation.

3. Mode Identifier. The Mode Identifier tells the receiver the format of the rest of the
packet. Table 1 lists all the supported modes that can be selected in the Mode Identifier. The
Mode Identifier is protected with a CRC to prevent the receiver from decoding based on wrong
assumptions.


Table 1. Supported Modes of the Packet Structure

Mode Description                         Supported Values
Number of TX Antennas                    1, 2, 3, 4
Number of Multiplexed Spatial Streams    1, 2, 3, 4
Space Time Block Code                    Yes or No
Constellation                            BPSK, QPSK, 16QAM, 64QAM
Channel Coding                           Binary Convolutional Code or LDPC
Packet Length                            0 to 2^16 - 1 bytes
Coding Rate                              1/2, 2/3, 3/4, 5/6
MIMO Channel Training Length             0 to 4 symbols
High Mobility Support Extension          Yes or No
Transmit Beamforming                     Yes or No

4. MIMO AGC Field. The MIMO AGC Field helps the receiver adjust its gain settings for the MIMO
section of the packet. This field is not transmitted in SISO mode.

5. MIMO Channel Training Field. The MIMO Channel Training Field assists the receiver with MIMO
channel estimation and possibly with improving carrier and timing synchronization. This field
is not transmitted in SISO mode.

6. Data Field. The Data field contains both payload and pilots. Two different pilot schemes are
supported for the best tradeoff between bandwidth efficiency and support for mobility. In
scenarios of low mobility, low order constellation, or short packet length, a few subcarriers
are dedicated to transmitting pilots that help the receiver track phase and frequency offset.
When the channel changes significantly during the packet, a more sophisticated pilot scheme is
used, in which pilot symbols are spread across the frequency-time domain. The receiver can use
these pilot symbols to estimate and track the channel continuously. For detailed information on
the pilot scheme for high mobility, refer to the section on the high mobility extension.
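
As an illustration of how a Mode Identifier could carry the settings of Table 1, the following Python sketch packs them into a bit field and appends a CRC. The field order, bit widths, value encodings, and the use of CRC-32 (from Python's `zlib`) are all assumptions made for this example; the report does not specify the actual bit layout or CRC polynomial.

```python
import zlib

# (field name, bit width): hypothetical widths sized to the ranges in Table 1.
FIELDS = [
    ("n_tx_minus_1", 2),       # 1..4 TX antennas stored as 0..3
    ("n_streams_minus_1", 2),  # 1..4 spatial streams stored as 0..3
    ("stbc", 1),               # 0 = no, 1 = yes
    ("constellation", 2),      # 0 = BPSK, 1 = QPSK, 2 = 16QAM, 3 = 64QAM
    ("ldpc", 1),               # 0 = convolutional code, 1 = LDPC
    ("length_bytes", 16),      # 0 .. 2^16 - 1
    ("rate", 2),               # 0 = 1/2, 1 = 2/3, 2 = 3/4, 3 = 5/6
    ("training_symbols", 3),   # 0 .. 4
    ("high_mobility", 1),      # 0 = no, 1 = yes
    ("beamforming", 1),        # 0 = no, 1 = yes
]                              # 31 bits in total, carried in 4 bytes


def pack_mode(mode):
    """Pack the mode fields into 4 bytes and append a 4-byte CRC-32."""
    word = 0
    for name, width in FIELDS:
        value = mode[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    payload = word.to_bytes(4, "big")
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unpack_mode(blob):
    """Return the mode dictionary, or None if the CRC check fails."""
    payload, crc = blob[:4], blob[4:8]
    if zlib.crc32(payload).to_bytes(4, "big") != crc:
        return None  # receiver must not decode based on a corrupted header
    word = int.from_bytes(payload, "big")
    mode = {}
    for name, width in reversed(FIELDS):
        mode[name] = word & ((1 << width) - 1)
        word >>= width
    return mode
```

A receiver built along these lines would abandon the packet when `unpack_mode` returns `None` instead of configuring the MIMO decoder from a corrupted header.
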

Using the packet structure described above, a complete simulation system has been implemented
in Matlab. Certain functions are implemented as MEX functions in C/C++ to achieve good
simulation speed. The implementation is highly modularized and parameterized, both for easy
upgrade as the requirements evolve and for providing a simulation environment to test
innovative ideas. The system is a complete packet level end-to-end simulation system, including
packet generation, all major hardware/RF impairments, a sophisticated MIMO channel model, and a
complete reference receiver. Figure 2 and Figure 3 show the block diagrams of the transmitter
and the receiver.


Figure  2.  Functional  Block  Diagram  of  the  Transmitter 


Figure  3.  Functional  Block  Diagram  of  the  Receiver 

The  simulation  system  supports  the  following  features: 

•  Any  combination  of  antenna  configuration,  up  to  4  transmit  antennas  and  4  receive  antennas. 

•  Spatial multiplexing with up to 4 spatial streams.

•  Space-time block codes.

•  Spatial cyclic delay diversity.

•  Hybrid space-time and spatial multiplexing system.

•  Binary convolutional code and LDPCC with variable coding rate: 1/2, 2/3, 3/4, 5/6.

•  Transmit  per-subcarrier  beamforming. 

•  Space-frequency-time  3  dimensional  interleaver  for  maximum  coding  diversity  gain. 

•  Different  constellation  size:  BPSK,  QPSK,  16QAM,  64QAM. 

•  Soft/hard  decision  Viterbi  decoder. 

•  Layered  fast  LDPCC  decoder. 

•  Versatile  channel  model  with  programmable  delay  and  power  profile  and  Doppler  spread. 

•  Programmable  hardware  and  RF  impairments  such  as  carrier  frequency  offset,  phase  noise, 
power  amplifier  non-linearity,  and  I/Q  imbalance. 

•  Fixed  point  model  for  key  components  such  as  MIMO  detection  and  synchronization. 


All these features can be controlled by toggling flags and/or setting parameters. For instance,
most parts of the receiver can be instructed to use perfect information so that the
implementation loss can be assessed on an individual module basis. The simulation system also
features a complete low-complexity, high performance Silvus proprietary receiver that can serve
as a design for real-time implementation. The simulation platform that the Silvus-UCLA team
developed has become a valuable tool for trying out innovative ideas. The rest of the section
presents some of the results obtained using this simulation platform.

Simulation  results 

The results presented here come from full system simulations in various channel scenarios. Each
is a complete physical layer end-to-end packet simulation that includes all the necessary
transmitter and receiver algorithms as well as the major hardware impairments. All algorithms
are practical in terms of hardware implementation: if a real-time implementation is required,
they can be directly translated into a fixed point implementation and mapped onto hardware.

The following is a list of important parameters of the simulations reported here.

•  4  transmit  antennas  with  4  spatial  data  streams 

•  4  receive  antennas 

•  20MHz  bandwidth 

•  2.4GHz  carrier  frequency 

•  Sub-carrier spacing 312.5 kHz

•  56 effective carriers: 52 data carriers + 4 pilot carriers

•  Bit interleaved coded modulation with binary convolutional code and QPSK, 16QAM, 64QAM

•  The Jakes model [21] is used for simulating time varying fading

•  The frequency selective multipath fading model is based on the temporal multi-clustering
model proposed in [22]. It has an exponential delay and power profile
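
The bandwidth, sub-carrier spacing, and carrier count listed above fix the basic OFDM numerology. The short Python sketch below derives it, assuming the sample rate equals the 20 MHz signal bandwidth; the function and key names are illustrative only.

```python
def ofdm_numerology(bandwidth_hz, spacing_hz, used_carriers):
    """Derive basic OFDM quantities, assuming sample rate == bandwidth."""
    fft_size = round(bandwidth_hz / spacing_hz)
    return {
        "fft_size": fft_size,                          # number of FFT bins
        "useful_symbol_s": 1.0 / spacing_hz,           # symbol time without guard
        "occupied_bw_hz": used_carriers * spacing_hz,  # span of active carriers
        "unused_bins": fft_size - used_carriers,       # DC and band-edge nulls
    }


# Values taken from the parameter list above.
params = ofdm_numerology(20e6, 312.5e3, 56)
```

With the listed values this gives a 64-point FFT, a useful symbol duration of 3.2 µs (before any guard interval, whose length is not stated here), and 56 occupied bins out of 64, spanning 17.5 MHz.
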

The following hardware impairments were introduced in the simulation.

•  Phase noise on both TX and RX sides. The phase noise model for the IEEE next generation
wireless LAN standard is used. It represents the phase noise one would observe in a typical
inexpensive commercial grade system. Refer to [19] for the details of this model.

•  Nonlinearity of power amplifiers. The model for IEEE 802.11b is used. An output power
backoff of 10 dB is used in the simulation. Refer to [20] for details.

•  Carrier frequency offset. A frequency offset of 130 kHz is used for all simulations.

The  following  list  describes  the  important  features  of  the  receiver  algorithm. 

•  Two-stage packet detection and OFDM symbol timing is used. The first stage keeps searching
for the SISO AGC preamble in the incoming signal and has very low complexity. Once it finds the
preamble, a packet is declared found. The second stage uses the SISO Channel Training field to
fine tune OFDM symbol timing. Carrier frequency estimation, noise variance estimation, and SISO
channel estimation are done in the same stage.

•  MIMO channel estimation is done once using the MIMO Channel Training field. There is no
channel tracking afterwards. See the section on the high mobility extension for our simulation
results with channel tracking.

•  MIMO detection is based on the principle of linear minimum mean square error (LMMSE)
detection.

•  A fast QAM soft demapper with complexity linear in the number of bits is developed and used
in the simulation.

•  Phase noise tracking is performed on a symbol by symbol basis using the embedded pilot
carriers.

•  A soft input sliding window Viterbi algorithm is used to decode the convolutional code.
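
To make the LMMSE detection step in the list above concrete, the following Python sketch applies the textbook LMMSE equalizer, x_hat = (H^H H + sigma^2 I)^-1 H^H y, to a single subcarrier. It is not the Silvus receiver: it assumes unit-energy transmit symbols, a known channel matrix, and a 2x2 channel (chosen only to keep the closed-form inverse short, whereas the simulations use 4x4).

```python
import math


def lmmse_detect_2x2(H, y, noise_var):
    """LMMSE estimate of two transmitted symbols from two received samples.

    H is a 2x2 nested list of complex channel gains (rows = RX antennas),
    y is the list of two received samples, noise_var the noise variance.
    """
    (a, b), (c, d) = H
    # Hermitian transpose of H.
    hh = [[a.conjugate(), c.conjugate()],
          [b.conjugate(), d.conjugate()]]
    # Regularised Gram matrix G = H^H H + noise_var * I.
    g00 = hh[0][0] * a + hh[0][1] * c + noise_var
    g01 = hh[0][0] * b + hh[0][1] * d
    g10 = hh[1][0] * a + hh[1][1] * c
    g11 = hh[1][0] * b + hh[1][1] * d + noise_var
    det = g00 * g11 - g01 * g10
    # Matched filter output z = H^H y.
    z0 = hh[0][0] * y[0] + hh[0][1] * y[1]
    z1 = hh[1][0] * y[0] + hh[1][1] * y[1]
    # x_hat = G^-1 z using the closed-form inverse of a 2x2 matrix.
    return [(g11 * z0 - g01 * z1) / det,
            (-g10 * z0 + g00 * z1) / det]


# Example with an arbitrary channel and two QPSK symbols, no noise added.
H_example = [[1 + 0.5j, 0.3 - 0.2j], [0.2 + 0.1j, 0.9 - 0.4j]]
x_sent = [(1 + 1j) / math.sqrt(2), (-1 + 1j) / math.sqrt(2)]
y_rx = [H_example[r][0] * x_sent[0] + H_example[r][1] * x_sent[1] for r in range(2)]
x_hat = lmmse_detect_2x2(H_example, y_rx, noise_var=1e-9)
```

As the noise variance goes to zero the estimate approaches the zero-forcing solution; at low SNR the regularization term limits noise enhancement.
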

The simulations examine the performance of the whole system from three different perspectives.

•  Performance as a function of Doppler spread for QPSK, 16QAM, and 64QAM at speeds of 5 MPH,
100 MPH, and 500 MPH. 500 MPH is not simulated for 16QAM and 64QAM because the performance is
not acceptable even with ideal synchronization and no RF impairments (see Figure 5 and
Figure 7). A flat fading channel is chosen in this case.

•  Performance as a function of delay spread at RMS delay spreads of 0 ns, 15 ns, 50 ns, and
150 ns. 16QAM at 5 MPH is used for this simulation.

•  Performance as a function of packet length for QPSK. The speed is set to 100 MPH.

The results are summarized in the following plots and paragraphs. For comparison, some results
for ideal synchronization and no RF impairments are also included.

Performance vs. Speed


Figure 4. Packet Error Rate for QPSK at Different Speed


Figure 5. Performance of 16QAM with Ideal Synchronization and No Hardware Impairments


Figure 7. Performance of 64QAM with Ideal Synchronization and No Hardware Impairments


Figure 8. Packet Error Rate for 64QAM at Different Speed

Figure 4, Figure 6, and Figure 8 show the performance vs. speed simulation results. These
results show that if the required packet error rate is 10% with 100-byte packets, QPSK is
applicable at speeds above 500 MPH, 16QAM is applicable at speeds above 100 MPH, and 64QAM
requires 42 dB SNR at 100 MPH. The RF impairments and the practical synchronization algorithm
introduce a loss of approximately 4 dB, 5 dB, and 12 dB at 10% PER and 100 MPH for QPSK, 16QAM,
and 64QAM respectively. It is clear that if a high order constellation such as 64QAM is desired
at high speed, the receiver needs to perform channel tracking to combat the time-varying fading.

Performance  vs.  Delay  Spread 


Figure 9. Packet Error Rate for 16QAM with Different Delay Spread

Figure 9 shows the performance of the system using 16QAM with the practical receiver and RF
impairments. Interestingly, the diversity effect that a 15 ns delay spread offers does not
exhibit itself at low PER. However, a larger delay spread such as 50 ns or 150 ns offers a
significant advantage over flat fading. For instance, a 50 ns delay spread offers about 7 dB
gain at 10% PER compared to flat fading. The synchronization and RF impairments introduce about
4 dB loss.

Performance  vs.  Packet  Length 


Figure  10.  Packet  Error  Rate  of  QPSK  with  Different  Packet  Size 

Figure  10  shows  that  if  10%  is  the  target  packet  error  rate,  800  bytes  is  the  maximum  packet  size  the  simulated 
system  supports.  A  significant  error  floor  is  observed  when  the  packet  length  is  above  400  bytes. 

Conclusion Drawn from the Simulation Results

With QPSK, the current MIMO-OFDM system works reasonably well at speeds below 100 MPH. For
short packet communication, higher order constellations such as 16QAM and 64QAM, or higher
speeds such as 500 MPH, are possible. If more bandwidth efficiency and/or higher speed are
desired, a channel tracking algorithm designed for time varying channels is required. This work
is reported in the section on the high mobility extension.


4. Design and Simulation of Low Density Parity Check Code

Low density parity check codes are linear binary block codes whose parity check matrices
contain mostly zeros and only a small number of ones. An LDPC code can be described by an
$M \times N$ parity check matrix, H, or by a graphical representation called a bipartite graph.
The M rows of the parity check matrix specify the M constraints on the codeword bits, and the N
columns define the codeword length. Similarly, the bipartite graph contains N bit nodes, one
for each bit (column of H), and M check nodes, one for each of the parity checks (row of H).
Figure 11 illustrates the parity check matrix, H, for a simple (7, 3) code and the
corresponding bipartite graph, which provides a graphical representation of the parity check
matrix and assists in understanding the iterative soft decoding algorithm. In the bipartite
graph (Figure 11) of the (7, 3) code, bit nodes are denoted by circles and check nodes by
squares. Each check node is connected to the bit nodes it checks; in other words, check node j
is connected to bit node i whenever element $H_{ji}$ of H is a 1.


Figure 11. Bipartite graph of a (7, 3) code
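
The correspondence between a parity check matrix and its bipartite graph can be sketched in a few lines of Python. The matrix used here is a small hypothetical example, not the (7, 3) code of Figure 11, and the function name is illustrative.

```python
def bipartite_graph(H):
    """Return the edge lists of the bipartite graph of parity check matrix H.

    bits_of_check[j] lists the bit nodes connected to check node j (row j);
    checks_of_bit[i] lists the check nodes connected to bit node i (column i).
    """
    m, n = len(H), len(H[0])
    bits_of_check = {j: [i for i in range(n) if H[j][i] == 1] for j in range(m)}
    checks_of_bit = {i: [j for j in range(m) if H[j][i] == 1] for i in range(n)}
    return bits_of_check, checks_of_bit


# Hypothetical 3 x 6 parity check matrix, used for illustration only.
H_small = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
bits_of_check, checks_of_bit = bipartite_graph(H_small)
```

Every 1 in the matrix becomes one edge of the graph, so the two dictionaries always contain the same number of entries in total.
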


In the following, three variations of the LDPC decoding algorithm, namely the sum-product,
Offset Min-Sum, and Layered Decoding algorithms, are discussed and the corresponding simulation
results are presented.


Sum  Product  Algorithm 


The sum-product decoding algorithm (SPA) [23][24] works iteratively by passing messages along
the edges of the associated bipartite graph. The messages are log-likelihood ratios (LLRs),
where the sign of the message represents the binary digit and the magnitude denotes the
reliability of the message.


Before describing the sum-product algorithm, the notation is introduced. The set
$V(j) = \{i : H_{ji} = 1\}$ defines the bit nodes that are connected to check node j, and the
set of check nodes that are connected to bit node i is denoted by
$\mu(i) = \{j : H_{ji} = 1\}$. The set $V(j)$ with bit i excluded is written
$V(j) \setminus i$, and the set $\mu(i)$ with check j excluded is written
$\mu(i) \setminus j$. $Q_{ij}$ denotes the message sent from bit node i to check node j, and
$R_{ji}$ denotes the message passed from check node j to bit node i. The sum-product algorithm
starts with an initialization step and then iterates by exchanging messages between bit and
check nodes. Decoding stops when all the parity checks are satisfied or the maximum number of
iterations is reached. The main steps of the decoding are summarized as follows.


1st Step: Initialization: Each bit node i is assigned an a posteriori log-likelihood ratio,

$$L_i = \ln\frac{p_i(0)}{p_i(1)} = \frac{2 y_i}{\sigma^2}$$

where $p_i(0)$ and $p_i(1)$ are the probabilities of bit i being 0 and 1, $y_i$ is the
received soft bit value, and $\sigma^2$ is the noise variance of the channel. This form
corresponds to bit 0 being mapped to the positive BPSK symbol, which is consistent with the
hard decision rule in (4). The messages $Q_{ij}$ are initialized to $L_i$.


2nd Step: Check Node Operation: The check node to bit node messages, $R_{ji}$, are calculated
according to (1),

$$R_{ji} = \Bigl(\prod_{i' \in V(j)\setminus i} \alpha_{i'j}\Bigr)\,
\psi\Bigl(\sum_{i' \in V(j)\setminus i} \psi(\beta_{i'j})\Bigr) \qquad (1)$$

where

$$\alpha_{ij} = \operatorname{sign}(Q_{ij}), \qquad \beta_{ij} = |Q_{ij}|, \qquad
\psi(x) = \log\frac{e^{x} + 1}{e^{x} - 1}$$

3rd Step: Bit Node Operation: The bit node to check node messages are computed from (2).

$$Q_{ij} = L_i + \sum_{j' \in \mu(i)\setminus j} R_{j'i} \qquad (2)$$

In this step, the soft decision values, $Q_i$, are also computed from (3); they are then used
in (4).

$$Q_i = L_i + \sum_{j \in \mu(i)} R_{ji} \qquad (3)$$

4th Step: Syndrome Check: The parity check operation, $cH^T$, is performed on the hard
decision variables, $c_i$, obtained from (4).

$$c_i = \begin{cases} 1 & \text{if } Q_i < 0 \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

If $cH^T = 0$ or the number of iterations equals the maximum limit, the decoder stops;
otherwise it goes back to the 2nd step and continues iterating.
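
The four steps above can be sketched as a compact floating point decoder in Python. This is a minimal illustration, not the simulation platform's decoder. It assumes LLRs defined so that a negative value means bit 1 (matching the hard decision rule in the 4th step), uses the identity psi(x) = -log(tanh(x/2)), and clamps the argument of psi to avoid numerical overflow. The clamping limits, the function names, and the (7, 4) Hamming parity check matrix in the example are assumptions made for the sketch.

```python
import math


def psi(x):
    """psi(x) = log((e^x + 1) / (e^x - 1)) = -log(tanh(x / 2)), with clamping."""
    x = min(max(x, 1e-12), 30.0)
    return -math.log(math.tanh(x / 2.0))


def spa_decode(H, llr, max_iter=20):
    """Sum-product decoding; returns (hard decisions, iterations used)."""
    m, n = len(H), len(H[0])
    bits_of = [[i for i in range(n) if H[j][i]] for j in range(m)]
    checks_of = [[j for j in range(m) if H[j][i]] for i in range(n)]
    # 1st step: bit-to-check messages start from the channel LLRs.
    Q = {(i, j): llr[i] for j in range(m) for i in bits_of[j]}
    R = {}
    hard = [1 if v < 0 else 0 for v in llr]
    for iteration in range(1, max_iter + 1):
        # 2nd step: check node update, equation (1).
        for j in range(m):
            for i in bits_of[j]:
                sign, total = 1.0, 0.0
                for k in bits_of[j]:
                    if k != i:
                        if Q[(k, j)] < 0:
                            sign = -sign
                        total += psi(abs(Q[(k, j)]))
                R[(j, i)] = sign * psi(total)
        # 3rd step: bit node update, equations (2) and (3).
        soft = []
        for i in range(n):
            q_i = llr[i] + sum(R[(j, i)] for j in checks_of[i])
            soft.append(q_i)
            for j in checks_of[i]:
                Q[(i, j)] = q_i - R[(j, i)]  # exclude the target check's own message
        # 4th step: hard decision (4) and syndrome check.
        hard = [1 if v < 0 else 0 for v in soft]
        if all(sum(hard[i] for i in bits_of[j]) % 2 == 0 for j in range(m)):
            return hard, iteration
    return hard, max_iter


# Example: (7, 4) Hamming parity check matrix, all-zero codeword sent,
# first bit received with a weak LLR of the wrong sign.
H_HAMMING = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
decoded, used = spa_decode(H_HAMMING, [-0.5, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0])
```

The Hamming code is far denser than a practical LDPC code and is used only because it is small enough to check by hand.
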


Offset  Min-Sum  Algorithm 

The SPA is the best performing, yet the most complex, algorithm for decoding LDPC codes. In the
last decade, various complexity reduction schemes for decoding LDPC codes have been studied.
One promising reduced complexity decoding scheme is the Min-Sum algorithm [25][26], which does
not require any channel state information and involves only addition and compare operations.
To improve the accuracy of the check node operation in the Min-Sum algorithm, the output
reliability values can be reduced by a positive constant $\eta$. This approach is called the
Offset Min-Sum (OMS) algorithm and can be implemented simply by replacing Equation (1) of the
SPA with (5).
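
In its commonly published form, the offset min-sum check node update sets each outgoing message to the product of the signs of the other incoming messages times max(min of the other incoming magnitudes - offset, 0). The Python sketch below implements that textbook form for a single check node; the report's own Equation (5) may differ in notation, and the default offset value is a placeholder rather than the value used in the simulations.

```python
def oms_check_update(q_in, offset=0.5):
    """Offset min-sum update for one check node.

    q_in holds the incoming bit-to-check messages of the check node; the
    returned list holds the outgoing check-to-bit message for each position.
    """
    out = []
    for idx in range(len(q_in)):
        others = q_in[:idx] + q_in[idx + 1:]       # exclude the target bit
        sign = 1.0
        for q in others:
            if q < 0:
                sign = -sign                        # product of the signs
        smallest = min(abs(q) for q in others)      # least reliable input
        out.append(sign * max(smallest - offset, 0.0))
    return out
```

Only comparisons, sign flips, and one subtraction per output are needed, which is the source of the complexity saving over the psi function of the SPA.
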


Layered  Decoding  Algorithm 

Another promising variation of the SPA is obtained by using a modified message processing
schedule, called layered decoding [27]. In this approach, an LDPC code is viewed as a
concatenation of m constituent codes, or layers. Consequently, a single LDPC decoder iteration
consists of m successive sub-iterations, one per constituent code, and the updated messages
from each constituent code are passed to the next constituent code to be processed in the next
sub-iteration. As opposed to the standard SPA, this technique makes the updated messages
available sooner and leads to faster convergence of the LDPC decoder. Furthermore, processing
messages in layers reduces the memory requirement per decoding iteration.

Figure 12 shows a two layer parity check matrix for a (12, 4) LDPC code. The rows of the parity
check matrix are grouped into non-overlapping subsets such that, within each subset, every
column has a weight of at most one.


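$$H = \left[\begin{array}{cccccccccccc}
1&0&0&0&0&0&0&1&0&1&0&0\\
0&1&0&0&1&0&0&0&0&0&1&0\\
0&0&1&0&0&1&0&0&0&0&0&1\\
0&0&0&1&0&0&1&0&1&0&0&0\\
\hline
0&0&1&0&0&0&0&0&1&0&0&0\\
0&0&0&1&0&0&0&0&0&1&0&0\\
1&0&0&0&0&0&0&0&0&0&1&0\\
0&1&0&0&0&0&0&0&0&0&0&1
\end{array}\right]$$


Figure 12. Layered parity check matrix for a (12, 4) LDPC code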
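
The layering condition (within a layer, every column has weight at most one) is easy to check mechanically. The Python sketch below tests a proposed layering and also builds one greedily. The greedy first-fit strategy and the function names are assumptions made for illustration, and the 8 x 12 matrix is patterned on Figure 12.

```python
def is_valid_layer(H, rows):
    """True if every column has weight at most one within the given rows."""
    n = len(H[0])
    return all(sum(H[j][i] for j in rows) <= 1 for i in range(n))


def greedy_layers(H):
    """Assign each row to the first layer in which it causes no column overlap."""
    layers = []
    for j in range(len(H)):
        for layer in layers:
            if is_valid_layer(H, layer + [j]):
                layer.append(j)
                break
        else:
            layers.append([j])      # no existing layer fits: open a new one
    return layers


# 8 x 12 matrix patterned on Figure 12 (two layers of four rows each).
H_FIG12 = [
    [1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
]
layers = greedy_layers(H_FIG12)
```

For this matrix the greedy pass recovers the two layers of four rows each; practical code designs usually build the layered structure into the matrix rather than searching for it afterwards.
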


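The layered decoding algorithm can be applied independently to SPA or OMS. The layered
sum-product algorithm (LSPA) is a simple variation of (1)-(3) and is given in (6)-(8). In this
case, all $Q_i$ are first initialized to $2y_i/\sigma^2$. Then, for every bit node i in layer
k of the rows, (6)-(8) are repeated for one layer after another. In (6), $R_{ji}$ is the value
from the previous iteration (zero in the first iteration); in (8), it is the newly computed
value from (7).

$$Q_{ij} = Q_i - R_{ji} \qquad (6)$$

$$R_{ji} = \Bigl(\prod_{i' \in V(j)\setminus i} \alpha_{i'j}\Bigr)\,
\psi\Bigl(\sum_{i' \in V(j)\setminus i} \psi(\beta_{i'j})\Bigr) \qquad (7)$$

$$Q_i = Q_{ij} + R_{ji} \qquad (8)$$

In the case of the layered offset min-sum (LOMS) algorithm, equation (7) is replaced by (5).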
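
A layered offset min-sum decoder following the schedule above can be sketched in Python as shown below. This is a floating point illustration under stated assumptions: the check node update is the textbook offset min-sum rule, the offset value is a placeholder, negative LLRs denote bit 1, and the example matrix is patterned on the two-layer matrix of Figure 12. None of the names come from the simulation platform.

```python
def layered_oms_decode(H, layers, llr, offset=0.5, max_iter=10):
    """Layered offset min-sum decoding; returns (hard decisions, iterations)."""
    n = len(H[0])
    bits_of = [[i for i in range(n) if row[i]] for row in H]
    Q = list(llr)                                  # soft values Q_i
    R = [[0.0] * len(bits) for bits in bits_of]    # check-to-bit messages
    hard = [1 if v < 0 else 0 for v in Q]
    for iteration in range(1, max_iter + 1):
        for layer in layers:                       # one sub-iteration per layer
            for j in layer:
                bits = bits_of[j]
                # (6): remove this check's previous contribution.
                q = [Q[i] - R[j][k] for k, i in enumerate(bits)]
                for k, i in enumerate(bits):
                    others = q[:k] + q[k + 1:]
                    negatives = sum(1 for v in others if v < 0)
                    sign = -1.0 if negatives % 2 else 1.0
                    magnitude = max(min(abs(v) for v in others) - offset, 0.0)
                    R[j][k] = sign * magnitude     # offset min-sum in place of (7)
                    Q[i] = q[k] + R[j][k]          # (8): updated soft value
        hard = [1 if v < 0 else 0 for v in Q]
        if all(sum(hard[i] for i in bits) % 2 == 0 for bits in bits_of):
            return hard, iteration
    return hard, max_iter


# 8 x 12 matrix patterned on Figure 12, with its two layers of four rows.
H_LAYERED = [
    [1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
]
LAYERS = [[0, 1, 2, 3], [4, 5, 6, 7]]

# All-zero codeword sent; the first bit arrives with a weak, wrong-signed LLR.
decoded_bits, iterations = layered_oms_decode(H_LAYERED, LAYERS, [-0.5] + [4.0] * 11)
```

Because the soft values are updated after every row, the correction made in the first layer is already visible to the second layer within the same iteration, which is the mechanism behind the faster convergence described above.
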


Simulation  Results 

In the following, BER comparisons of the SPA, OMS, and LOMS decoding algorithms are provided
for a (1728, 864) LDPC code [27]. In the simulations, an AWGN channel with zero mean and
variance $N_0/2$ is assumed and BPSK modulation is employed. Simulations are carried out until
100 word errors are collected for each SNR point.
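
For reference, the noise variance $N_0/2$ and the channel LLRs fed to the decoder can be derived from the SNR as sketched below in Python. The sketch assumes unit-energy BPSK symbols and that the SNR axis is $E_b/N_0$ in dB; the report does not state which SNR definition is used, so this is an assumption, as are the function names.

```python
def noise_variance(ebn0_db, code_rate):
    """Per-sample noise variance N0/2 for unit-energy BPSK at a given Eb/N0."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    # Es = 1 and Eb = Es / rate, hence N0 = 1 / (rate * Eb/N0).
    return 1.0 / (2.0 * code_rate * ebn0)


def channel_llr(y, sigma2):
    """Channel LLR 2*y/sigma^2 for BPSK with bit 0 sent as +1."""
    return 2.0 * y / sigma2


# The (1728, 864) code has rate 1/2.
sigma2_at_0db = noise_variance(0.0, 864 / 1728)
```

For the rate-1/2 code at 0 dB this gives a noise variance of 1, so a received sample of 0.5 maps to an LLR of 1.
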

In Figure 13, the floating point BER performance of the sum-product algorithm and the OMS
algorithm is illustrated for various numbers of iterations. As observed, increasing the maximum
number of decoding iterations of the SPA from 8 to 16 provides around 1 dB of performance
improvement, while only 0.25 dB is gained by doubling the decoding iterations from 16 to 32.
Compared with the SPA, the reduced complexity OMS decoding algorithm introduces a slight
performance loss at low SNR values, but it performs 0.1 dB better than the SPA at a BER of
$10^{-6}$.

Figure 13. BER performance of (1728, 864) LDPC code using SPA and OMS algorithms


Figure 14 compares the BER performance of the SPA with 16 iterations to the layered OMS
algorithm with 8 and 16 iterations. As seen, the LOMS algorithm achieves approximately the same
error rate performance as the SPA with only half the number of decoding iterations.


Figure 14. BER and WER performance comparison

A question that remains to be answered is whether the benefit of LDPC is worth the extra
complexity compared to the convolutional code. To answer this question, we need to design and
architect a real-time implementation of the LDPC code and compare its performance and
complexity with those of the convolutional code. This task is left for Phase II of this effort.


5. Verification with Silvus DSP MIMO Testbed


The designed MIMO-OFDM system has been tested over real channels on an existing Silvus 2x2 MIMO DSP
testbed. Figure 15 shows a system block diagram of the testbed. The testbed operates in the 915MHz ISM
band with 26 MHz bandwidth. The analog part is implemented using COTS integrated radios. Communications
algorithms can be implemented on the TI6416 DSP or on the host. The entire testbed is hosted in two
compact PCI (cPCI) chassis. For the tests conducted for this project, the same Matlab simulation code
is reused to generate and decode packets. The DSP is responsible for sending the generated packet to
the DACs and acquiring data from the ADCs into the on-board memory. Packet decoding is not done in
real time; however, the signal goes through actual channels and RF impairments.

A graphical user interface (GUI) was developed to monitor the field test and collect results. Figure 16
shows a snapshot of the GUI during a test. The GUI provides instant information such as the packet
error rate, channel singular values, and SNR as soon as the received packet is decoded. It also enables
users to observe the constellation, signal waveforms, and other intermediate receiver results.

The designed MIMO-OFDM system has been demonstrated working on this testbed. Both the throughput
increase from spatial multiplexing and the diversity gain from STBC were successfully demonstrated. For
instance, the screenshot clearly shows that much more reliable communication was achieved for 16QAM by
using STBC. In that particular test, there were no errors for STBC-coded 16QAM, whereas spatially
multiplexed 16QAM did incur errors. The SNR was around 18dB.
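
As an illustration of where the STBC diversity gain comes from on a 2x2 link, the sketch below encodes and combines one symbol pair with the Alamouti code. The report does not state which STBC the testbed uses, so the choice of Alamouti, the function names, and the noiseless channel are assumptions made for the example.

```python
import numpy as np

def alamouti_encode(s1, s2):
    """Rows are time slots, columns are transmit antennas."""
    return np.array([[s1, s2],
                     [-np.conj(s2), np.conj(s1)]])

def alamouti_combine(r, h):
    """Linear combining at the receiver.

    r: (2 slots, n_rx) received samples
    h: (n_rx, 2) channel, h[j, i] is the gain from tx antenna i to rx antenna j
    """
    h1, h2 = h[:, 0], h[:, 1]
    gain = np.sum(np.abs(h) ** 2)   # total channel energy over all four paths
    s1_hat = np.sum(np.conj(h1) * r[0] + h2 * np.conj(r[1])) / gain
    s2_hat = np.sum(np.conj(h2) * r[0] - h1 * np.conj(r[1])) / gain
    return s1_hat, s2_hat

rng = np.random.default_rng(1)
h = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
s1, s2 = (1 + 1j) / np.sqrt(2), (-1 + 1j) / np.sqrt(2)
r = alamouti_encode(s1, s2) @ h.T   # noiseless reception, quasi-static channel
e1, e2 = alamouti_combine(r, h)
```

Each symbol is scaled by the sum of all four squared path gains, so a deep fade on one path does not by itself cause an error. Spatial multiplexing sends independent symbols on each antenna instead, which doubles the rate but gives up this protection.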


Figure  15.  System  Block  Diagram  of  Silvus  DSP  Testbed 

Figure  16.  A  Screenshot  of  the  Field  Test  GUI


6.  Design  and  Simulation  for  High  Mobility  Extension 

Tracking a highly dynamic channel environment such as the one experienced by a UAV is a complicated
task. Because of frequency selectivity and the high mobility of the terminal in a high-Doppler
communication environment, the channel suffers from both frequency and time dispersion. As a
consequence of the rapidly time-varying channel, more pilot symbols (PS) are needed in the time domain.
It is necessary to sample the two-dimensional space (i.e. frequency and time) at greater than the
Nyquist rate of the channel process. To perform channel estimation and tracking in a high-mobility
environment, a new data-field structure, known as 2D checkerboard-pattern pilot symbol assisted
modulation (PSAM), is designed and shown in Figure 16.


Figure  16,  Data-Field  PSAM  Packet  Structure 
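
A staggered (checkerboard) pilot layout of the kind described above can be sketched as a boolean mask over the frequency-time grid. The grid size and pilot spacings below are hypothetical values picked for illustration; the actual placement is the one shown in the figure, which this sketch does not attempt to reproduce.

```python
import numpy as np

def checkerboard_pilot_mask(n_subcarriers, n_symbols, freq_step, time_step):
    """Return a (subcarrier, symbol) mask that is True at pilot positions.

    Pilot-bearing OFDM symbols occur every `time_step` symbols. On every
    other pilot-bearing symbol the pilot comb is shifted by half of
    `freq_step`, which produces the checkerboard pattern.
    """
    mask = np.zeros((n_subcarriers, n_symbols), dtype=bool)
    for j, t in enumerate(range(0, n_symbols, time_step)):
        offset = (freq_step // 2) if (j % 2) else 0
        mask[offset::freq_step, t] = True
    return mask

# Hypothetical example: 48 subcarriers, 12 OFDM symbols per packet
mask = checkerboard_pilot_mask(n_subcarriers=48, n_symbols=12,
                               freq_step=8, time_step=2)
pilot_overhead = mask.mean()   # fraction of grid cells spent on pilots
```

Staggering the comb halves the effective pilot spacing in frequency when the estimator interpolates across neighboring pilot symbols, at no extra overhead.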

In practice, known symbols (pilot symbols, PS) are inserted at the transmitter and the channel estimate
is acquired by interpolation. The channel estimator consists of linear combinations of the observations
at the PS locations. The simplest and highest-performing way to estimate channel information in a MIMO
environment is to use orthogonal modulation on each of the transmit antennas. Let $x_p$ denote the
observation of the pilot values at the receiver and $c$ the channel at the positions of the data
symbols (DS). Due to orthogonality, channel estimates can be obtained by:

$$\hat{c} = E[c\,x_p^H]\,\mathrm{Cov}(x_p)^{-1}\,x_p = W x_p$$

where $\mathrm{Cov}(x_p)$ is the covariance matrix of the PS observations and $W$ is the matrix of
Wiener filter coefficients. The current implementation of the MIMO-OFDM PSAM packet channel estimator
in the Silvus software simulator uses a particular OFDM packet structure that contains 12 OFDM symbols,
as shown in Figure 16. This particular PS placement in the frequency-time grid enables the system to
track a frequency roll across the frame. In some situations, the optimum interpolation filter for this
sampling of the noisy channel response is a linear filter whose tap coefficients are a function of the
channel statistics. The Wiener filter coefficients are therefore often pre-computed, which results in
an open-loop estimation structure with no acquisition time. The performance of this particular
MIMO-OFDM PSAM channel estimation scheme is simulated and the results are presented below.
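
The pre-computation of the Wiener coefficients can be sketched for the time dimension only. This is a simplified example under stated assumptions, not the simulator's estimator: it assumes a Jakes (Bessel $J_0$) time correlation, unit-power pilots, white noise, and hypothetical pilot positions 0, 4, 8 and 11 within the 12-symbol packet; the 2D frequency-time case uses the same formula with a 2D correlation function.

```python
import numpy as np

def jakes_corr(lag, fd_norm, n_quad=256):
    """Jakes time correlation J0(2*pi*fd_norm*lag).

    J0 is evaluated with Bessel's integral, J0(x) = (1/pi) * int_0^pi
    cos(x sin(theta)) d(theta), using the midpoint rule so that only
    NumPy is required.
    """
    x = 2.0 * np.pi * fd_norm * np.asarray(lag, dtype=float)
    theta = (np.arange(n_quad) + 0.5) * np.pi / n_quad
    return np.mean(np.cos(x[..., None] * np.sin(theta)), axis=-1)

def wiener_weights(data_idx, pilot_idx, fd_norm, snr_db):
    """W = E[c x_p^H] Cov(x_p)^{-1} for unit-power pilots in white noise."""
    data_idx = np.asarray(data_idx, dtype=float)
    pilot_idx = np.asarray(pilot_idx, dtype=float)
    noise_var = 10.0 ** (-snr_db / 10.0)
    r_cp = jakes_corr(data_idx[:, None] - pilot_idx[None, :], fd_norm)
    r_pp = jakes_corr(pilot_idx[:, None] - pilot_idx[None, :], fd_norm)
    r_pp = r_pp + noise_var * np.eye(pilot_idx.size)
    # Solve W r_pp = r_cp without forming an explicit inverse
    return np.linalg.solve(r_pp.T, r_cp.T).T

pilot_idx = [0, 4, 8, 11]   # hypothetical pilot-bearing symbol indices
W = wiener_weights(np.arange(12), pilot_idx, fd_norm=0.0072, snr_db=20.0)
# Channel estimate at all 12 symbols from 4 pilot observations: c_hat = W @ x_p
```

Because `W` depends only on the assumed Doppler and SNR, it can be computed once and stored, which is what makes the structure open loop.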


Figure  19,  Packet  Error  Rate  for  64QAM  Rate-1/2  BICM  at  500MPH,  Channel  D 

In our simulations, we assume an equal number of transmit and receive antennas (i.e. a 4x4 system with
4 spatial streams). Most of the OFDM PHY parameters are compatible with the ongoing IEEE 802.11n Next
Generation WLAN proposal, EWC PHY spec v1.13. In particular, we assume the following:

•  20MHz  bandwidth 

•  2.4GHz  carrier  frequency 

•  Sub-carrier spacing 312.5KHz

•  Bit  interleaved  coded  modulation  (BICM)  with  binary  convolution  code  (BCC)  and  QPSK, 
16QAM  and  64QAM 

•  A time-varying (TV) Jakes model is used to simulate both frequency-flat (FF) and
frequency-selective (FS) fading

•  Terminal  mobility  of  500MPH  (i.e.  0.72%  Normalized  Doppler) 

•  Pre-computed Wiener filter coefficients designed at 20dB SNR and 0.72% normalized Doppler

We compute the packet error rate (PER). Each packet consists of 1536 symbols. Depending on the
modulation and coding scheme (MCS), a flexible number of information bytes can be sent. We further
assume perfect timing synchronization and no frequency offset.
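
The 0.72% normalized Doppler quoted above can be checked with a short calculation. The sketch assumes that the Doppler is normalized by a 4 µs OFDM symbol, i.e. the 3.2 µs useful part implied by the 312.5KHz spacing plus a 0.8 µs guard interval; the guard interval is an assumption taken from common 802.11 practice and is not stated in this report. All variable names are illustrative.

```python
C = 299_792_458.0        # speed of light, m/s
MPH_TO_MPS = 0.44704     # miles per hour to metres per second

def normalized_doppler(speed_mph, carrier_hz, symbol_s):
    """Maximum Doppler shift multiplied by the OFDM symbol duration."""
    doppler_hz = speed_mph * MPH_TO_MPS * carrier_hz / C
    return doppler_hz * symbol_s

subcarrier_spacing_hz = 312.5e3
guard_s = 0.8e-6                                   # assumed guard interval
symbol_s = 1.0 / subcarrier_spacing_hz + guard_s   # 4 microseconds

nd = normalized_doppler(500.0, 2.4e9, symbol_s)

# Nyquist limit on pilot spacing in time: 2 * nd * spacing < 1
max_pilot_spacing = 1.0 / (2.0 * nd)

print(f"normalized Doppler: {100 * nd:.2f}%")
print(f"Nyquist pilot spacing limit: {max_pilot_spacing:.1f} OFDM symbols")
```

Under these assumptions the maximum Doppler shift is about 1.79 kHz and the product with the symbol time rounds to 0.72%, consistent with the figure used for the filter design. The Nyquist limit of roughly 70 symbols is far looser than any spacing possible within a 12-symbol packet, so the pilot density is driven by noise averaging and interpolation accuracy rather than by aliasing.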

Figure 17 presents a PER performance comparison between perfect channel state information (PCSI) and
PSAM for 16QAM Rate-1/2 BICM under Channel-D, which is a non-line-of-sight FS channel, at 500MPH. At 1%
PER, we observe that PSAM is only ~1.5dB away from PCSI. With the proposed packet structure, we are
able to reconstruct the channel state information at the receiver fairly accurately even at a terminal
speed as high as 500MPH.

Figure 18 presents a PER performance comparison between PCSI and PSAM for 16QAM Rate-1/2 BICM under
Channel-A, which is an FF channel, also at 500MPH. In this case, PSAM performs very close to PCSI, with
a degradation of no more than 0.2dB. This is because the channel is frequency flat and varies only in
time due to the high Doppler, which reduces the original 2D frequency-time interpolation to a
one-dimensional (time) interpolation. With effectively much more pilot resolution in time, PSAM
performance improves.

Figure 19 presents a PER performance comparison between PCSI and PSAM for 64QAM Rate-1/2 BICM under
Channel-D, which is a non-line-of-sight FS channel, at 500MPH. In this case, we observe that PSAM has
an error floor at about 10% PER. The reason for the error floor is twofold. First, a higher-order
constellation not only requires higher SNR but is also more sensitive to channel estimation errors,
because a denser grid on the constellation plane reduces the pair-wise Euclidean distance. Second, the
current packet design assumes a pilot-tone placement based on a 4 x 4 space-time block of
Walsh-Hadamard orthogonal sequences. At 500MPH terminal mobility, which corresponds to a normalized
Doppler of 0.72%, the orthogonality assumption is broken, and this results in an error floor. To
further improve the performance of 64QAM, one would need to use a different pilot symbol pattern.

Our preliminary results show that reliable communication with 4x4 16QAM in high mobility can be
achieved with time-frequency domain channel tracking. The problem is more challenging with 64QAM. We
leave this problem to the Phase II effort, in which we would optimize the pilot symbol values and
placement.


7.  Preparation  for  Real-time  Implementation  on  FPGA 

The Silvus-UCLA team also investigated a possible real-time implementation. A real-time
implementation offers the following benefits:

•  It  is  closest  to  an  actually  deployed  system  and  predicts  the  achievable  performance  more 
accurately. 

•  It enables more extensive field tests.

•  It enables field tests in a networked environment.

•  It enables field tests of certain PHY algorithms, such as feedback MIMO, that are otherwise
impossible or inaccurate. Feedback MIMO could significantly improve the system capacity
and anti-jam performance.

Two important tasks have been finished: identifying the development platform and mapping key
floating-point algorithms to fixed-point algorithms.

•  The Silvus team conducted an extensive search for available COTS FPGA development platforms
and identified a set of FPGA boards from Nallatech (www.nallatech.com) that offer enough
processing power for an advanced 4x4 MIMO communication system with 40MHz bandwidth. The
identified FPGA development platform is in compact PCI form factor and supports a scalable
architecture. Up to 4 high-performance Xilinx FPGAs and 8 DACs and ADCs can be supported on a
single cPCI board. This platform provides enough processing performance as well as a simple
analog baseband I/Q interface. A complete 4x4 MIMO communication system could readily be
assembled using a COTS RF product such as the MAX2829.


•  The fixed-point architecture of the major receiver blocks has been defined. Fixed-point
algorithms have been designed and implemented for most receiver algorithms that involve
floating-point arithmetic operations. The most critical part of the receiver is perhaps the
MIMO detection. Here we present simulation results of our fixed-point MMSE MIMO detection.


Figure  17.  Performance  of  Fixed  Point  MMSE  MIMO  Detection 


Our fixed-point MMSE MIMO detection is implemented using QR decomposition followed by a set of
matrix-vector multiplications. Figure 17 shows the performance of the fixed-point implementation in a
4x4 64QAM system with a rate-2/3 convolutional code. This system achieves an information bit rate of
192Mbps over a 20 MHz bandwidth and places a very stringent requirement on the accuracy of the MIMO
detection. The number of bits indicated in the plot is the number of bits used during the QR
decomposition process. The matrix-vector multiplication can be implemented with fewer bits without
noticeable performance loss. With the 16-bit implementation, the performance is almost identical to
that of the floating-point implementation for PER above 1% and suffers only 0.5 dB of loss at 0.1% PER.
The 14-bit implementation loses more at high PER but is a good choice if the PER requirement is low.
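
One common way to build an MMSE detector from a QR decomposition is sketched below in floating point. It is an assumption that this matches the structure used in the fixed-point design: the sketch uses the extended-matrix formulation, assumes unit-energy transmit symbols, and does not model word length or quantization. The function name is illustrative.

```python
import numpy as np

def mmse_filter_qr(H, noise_var):
    """MMSE filter G = (H^H H + noise_var * I)^{-1} H^H via QR decomposition.

    The channel is stacked on top of sqrt(noise_var) * I. With the reduced QR
    of that extended matrix, H_ext = Q R, the filter is G = R^{-1} Q1^H, where
    Q1 holds the first n_rx rows of Q.
    """
    n_rx, n_tx = H.shape
    H_ext = np.vstack([H, np.sqrt(noise_var) * np.eye(n_tx)])
    Q, R = np.linalg.qr(H_ext)          # Q: (n_rx + n_tx, n_tx), R: (n_tx, n_tx)
    Q1 = Q[:n_rx, :]
    return np.linalg.solve(R, Q1.conj().T)

rng = np.random.default_rng(0)
n = 4                                    # 4x4 system as in the text
H = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
noise_var = 0.01

G = mmse_filter_qr(H, noise_var)         # computed once per channel estimate

# Detection is then one matrix-vector product per received vector
x = (rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)) / np.sqrt(2)
noise = np.sqrt(noise_var / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y = H @ x + noise
x_hat = G @ y
```

The QR step is the numerically sensitive part, since it involves norms and divisions, which is consistent with the observation above that it needs more bits than the subsequent matrix-vector products.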


8.  Conclusion 

During the Phase I period of this contract, we have successfully designed a MIMO-OFDM packet
structure that supports non-line-of-sight communication in urban warfare. The packet structure
supports a variety of features that are configurable on a packet-by-packet basis. These configurable
features offer the possibility of an optimum tradeoff among bandwidth efficiency, reliability, and
complexity in a diverse warfare environment. A complete end-to-end physical layer simulation platform
has been constructed in Matlab. The feasibility of the developed packet structure and receiver
algorithms has been verified on the Silvus DSP MIMO Testbed. We have also finished the fixed-point
implementation of all the key modules of the receiver and identified a cPCI-based FPGA platform for
real-time implementation.

The work completed in this period lays a solid foundation for Phase II. We are fully prepared to
implement, on an FPGA platform, a real-time MIMO packet communication system that meets the needs of
non-line-of-sight urban warfare communications.